Add Glossary of Terms to Understanding Airbyte #6235
|
||
**Extract**: Retrieve data from a [source](../integrations/sources), which can be an application, database, anything really. | ||
|
||
**Transform**: Clean up the data. This is referred to as [normalization](./basic-normalization.md) in Airbyte and involves [deduplication](./connections/incremental-deduped-history.md), changing data types, formats, and more. |
can you put Transform 3rd in the list?
Done!
|
||
### Full Refresh Sync | ||
|
||
A **Full Refresh Sync** will attempt to retrieve all data from the destination every time a sync is run. Then there are two choices, **Overwrite** and **Append**. **Overwrite** deletes the data in the destination before running the sync and **Append** doesn't. |
A **Full Refresh Sync** will attempt to retrieve all data from the destination every time a sync is run. Then there are two choices, **Overwrite** and **Append**. **Overwrite** deletes the data in the destination before running the sync and **Append** doesn't. | |
A **Full Refresh Sync** will attempt to retrieve all data from the source every time a sync is run. Then there are two choices, **Overwrite** and **Append**. **Overwrite** deletes the data in the destination before running the sync and **Append** doesn't. |
Ah good catch, fixed!
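To make the Overwrite/Append distinction in this thread concrete, here is a minimal sketch. It assumes the destination is a plain Python list; the function name `full_refresh` and the `mode` strings are hypothetical and not taken from Airbyte's code.

```python
def full_refresh(source_records, destination, mode):
    """Copy every source record into the destination list.

    mode="overwrite" clears the destination first; mode="append" keeps
    whatever is already there. Names are illustrative assumptions only.
    """
    if mode not in ("overwrite", "append"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "overwrite":
        destination.clear()
    # Either way, all source records are read and written on every sync.
    destination.extend(source_records)
    return destination


source = [{"id": 1}, {"id": 2}]
overwritten = full_refresh(source, [{"id": 1}], "overwrite")
appended = full_refresh(source, [{"id": 1}], "append")
```

With Append, the record that was already in the destination is kept, so `id` 1 shows up twice after the sync.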
|
||
### Incremental Sync | ||
|
||
An **Incremental Sync** will only retrieve new data everytime the a sync occurs. The first sync will always attempt to retrieve all the data. If the [destination supports it](https://discuss.airbyte.io/t/what-destinations-support-the-incremental-deduped-sync-mode/89), you can have your data deduplicated. Simply put, this just means that if you sync an updated version of a record you've already synced, it will remove the old record. |
An **Incremental Sync** will only retrieve new data everytime the a sync occurs. The first sync will always attempt to retrieve all the data. If the [destination supports it](https://discuss.airbyte.io/t/what-destinations-support-the-incremental-deduped-sync-mode/89), you can have your data deduplicated. Simply put, this just means that if you sync an updated version of a record you've already synced, it will remove the old record. | |
An **Incremental Sync** will only retrieve new data from the source every time the a sync occurs. The first sync will always attempt to retrieve all the data. If the [destination supports it](https://discuss.airbyte.io/t/what-destinations-support-the-incremental-deduped-sync-mode/89), you can have your data deduplicated. Simply put, this just means that if you sync an updated version of a record you've already synced, it will remove the old record. |
Fixed!
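The incremental behaviour discussed here can be sketched roughly as follows. This is an illustration under assumptions, not Airbyte's implementation: the `updated_at` cursor field, the `id` primary key, and the function `incremental_sync` are all hypothetical.

```python
def incremental_sync(source_records, destination, cursor=None, dedupe=True):
    """Append only records newer than `cursor`; return the new cursor.

    Assumes each record has a hypothetical `id` (primary key) and
    `updated_at` (cursor field). With dedupe=True, an older copy of the
    same `id` is removed before the updated record is written.
    """
    new_records = [
        r for r in source_records
        if cursor is None or r["updated_at"] > cursor
    ]
    for record in new_records:
        if dedupe:
            destination[:] = [d for d in destination if d["id"] != record["id"]]
        destination.append(record)
    # If nothing new arrived, the cursor stays where it was.
    return max((r["updated_at"] for r in new_records), default=cursor)


source = [{"id": 1, "name": "Ann", "updated_at": 1}]
destination = []
cursor = incremental_sync(source, destination)  # first sync reads everything
source.append({"id": 1, "name": "Anne", "updated_at": 2})
cursor = incremental_sync(source, destination, cursor)  # only the update
```

After the second sync the destination holds only the updated version of record 1, which is the deduplication the glossary entry describes.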
|
||
### DAG | ||
|
||
DAG stands for **Directed Acyclic Graph**. It's an overly fancy term originally coined by math graph theorists that just describes a tree-like process. For example, in the following diagram, you start at A and can choose B or C, which then proceed to D and E, respectively. This kind of structure is great for representing workflows and is what tools like [Airflow](https://airflow.apache.org/) use to orchestrate the execution of software based on different cases or states. |
DAG stands for **Directed Acyclic Graph**. It's an overly fancy term originally coined by math graph theorists that just describes a tree-like process. For example, in the following diagram, you start at A and can choose B or C, which then proceed to D and E, respectively. This kind of structure is great for representing workflows and is what tools like [Airflow](https://airflow.apache.org/) use to orchestrate the execution of software based on different cases or states. | |
DAG stands for **Directed Acyclic Graph**. It's a term originally coined by math graph theorists that describes a tree-like process. For example, in the following diagram, you start at A and can choose B or C, which then proceed to D and E, respectively. This kind of structure is great for representing workflows and is what tools like [Airflow](https://airflow.apache.org/) use to orchestrate the execution of software based on different cases or states. |
You should mention that there cannot be a loop inside.
We want to make things simple but we shouldn't belittle definitions (`overly fancy`)!
Updated it - I'll hold back my pot shots at terminology :)
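As a sketch of the "no loop" point raised in this thread, the A/B/C/D/E example from the glossary entry can be written with Python's standard `graphlib` module (Python 3.9+). The helper `has_cycle` and the extra E-to-A edge are illustrative assumptions, not part of the PR.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps to the set of nodes that must come before it:
# A -> B -> D and A -> C -> E, as in the glossary example.
dag = {"B": {"A"}, "C": {"A"}, "D": {"B"}, "E": {"C"}}

# A valid execution order exists only because the graph has no loops.
order = list(TopologicalSorter(dag).static_order())


def has_cycle(graph):
    """Return True if the graph contains a loop (so it is not a DAG)."""
    try:
        # static_order() is lazy, so consume it to trigger the check.
        tuple(TopologicalSorter(graph).static_order())
    except CycleError:
        return True
    return False


# Hypothetical extra edge E -> A creates the loop A -> C -> E -> A.
with_loop = {**dag, "A": {"E"}}
```

Here `has_cycle(dag)` is false and `has_cycle(with_loop)` is true, which is the "acyclic" part of the name: once a loop exists, no valid ordering of the steps can be produced.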
This is only relevant for individuals who want to create a connector. | ||
{% endhint %} | ||
|
||
This refers to how you define the data that you can retrieve from a Source. For example, if you want to retrieve `Account` data from your [Salesforce Source](../integrations/sources/salesforce.md), it needs to be defined clearly so that it can be translated to the destination. Learn more [here](./beginners-guide-to-catalog.md). |
You could introduce the concept of a schema; that would make the definition clearer.
Makes sense, I've updated it with a definition I think non-technical people can understand too.
docs/SUMMARY.md
Outdated
@@ -193,6 +193,7 @@ | |||
* [Templates](contributing-to-airbyte/templates/README.md) | |||
* [Connector Doc Template](contributing-to-airbyte/templates/integration-documentation-template.md) | |||
* [Understanding Airbyte](understanding-airbyte/README.md) | |||
* [Glossary of Terms](understanding-airbyte/glossary.md) |
this should probably be at the bottom of the list since it's more of a reference rather than something that people read end-to-end
Makes sense, done.
@@ -0,0 +1,54 @@ | |||
# Glossary of Terms | |||
|
|||
### ETL/ELT |
Can we put this in alphabetical order?
Yep done.
Main Changes